Coping with Two Different Transmission Channels in Language Recognition
Authors
Abstract
This paper confirms the large benefit of Factor Analysis over Maximum A Posteriori (MAP) adaptation for language recognition (up to 87% relative gain). We investigate ways to cope with a particularity of the NIST LRE 2009 data, which contain both Conversational Telephone Speech (CTS) and telephone-bandwidth segments of radio broadcasts (Voice of America, VOA). We analyze GMM systems that use all data pooled together, systems with eigensession matrices estimated per condition, and systems that use a concatenation of these matrices. Results are presented on all LRE 2009 test segments, as well as on the CTS-only and VOA-only test utterances. Because performance on all 23 languages is not straightforward to compare, owing to language–channel combinations that are missing from the training and testing data, all systems are also evaluated on the subset of 8 common languages. To address the question of whether a fusion of two channel-specific systems is more beneficial than pooling all data, we study an oracle-based system selector. On the 8-language subset, a pure CTS system achieves a minimal average cost (minCavg) of 2.7% and a pure VOA system 1.9% on their respective test conditions. The fusion of these two systems achieves 2.0% minCavg. Our main observation is that the way the session compensation matrix is estimated has little influence, as long as the language–channel combinations cover those used for training the language models. Far more crucial is the kind of data used for model estimation.
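The oracle-based system selector mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "oracle" means the true channel of each test segment (CTS or VOA) is known and is used to pick the scores of the matching channel-specific system. The abstract does not spell out this mechanism, and the function names, score layout, and language labels below are hypothetical, not taken from the paper.

```python
from typing import Dict

# Hypothetical score layout: language name -> detection score for one segment.
Scores = Dict[str, float]


def oracle_select(cts_scores: Scores, voa_scores: Scores, channel: str) -> Scores:
    """Return the scores of the system that matches the segment's true channel.

    Assumption: the channel label is ground truth (the "oracle"), not the
    output of an automatic channel detector.
    """
    if channel == "CTS":
        return cts_scores
    if channel == "VOA":
        return voa_scores
    raise ValueError(f"unknown channel: {channel!r}")


def decide(scores: Scores) -> str:
    """Pick the language with the highest score."""
    return max(scores, key=scores.get)


# Illustrative scores for one segment from each channel-specific system.
cts = {"english": 1.2, "farsi": -0.3}
voa = {"english": -0.5, "farsi": 0.8}

selected = oracle_select(cts, voa, "VOA")
hypothesis = decide(selected)
```

Because the channel label is assumed to be known, a selector of this kind gives an optimistic bound on what switching between two channel-specific systems could achieve; a deployed system would have to estimate the channel first.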
Similar resources
Reconnoitering the effective Channels of Monetary Transmission Mechanism in Iran Using a Dynamic Stochastic General Equilibrium Model
The purpose of the present research is to investigate the effective channels of the monetary transmission mechanism in Iran. To do so, we devised a New Keynesian Dynamic Stochastic General Equilibrium Model. In our model, the different types of nominal rigidities are introduced beside all the related structural equations, which are extracted and linearized around a steady state point. Furthermo...
Perceived Stress Severity and Strategies for Coping with Stress
Identifying the determinants of individuals' strategies for coping with stressful situations, in order to strengthen their ability to manage and control stress, has been a substantial issue in behavioral studies. The aim of this study was to determine the role of perceived severity of stress in individuals' strategies for coping with stressful situations. Data were collected from 373 students of Y...
Written word recognition by the elementary and advanced level Persian-English bilinguals
According to a basic prediction made by the Revised Hierarchical Model (RHM), at early stages of language acquisition, strong L2-L1 lexical links are formed. RHM predicts that these links weaken with increasing proficiency, although they do not disappear even at higher levels of language development. To test this prediction, two groups of highly proficie...
A Comparison of Spectral Methods for Spoken Language Identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
A New Unequal Error Protection Technique Based on the Mutual Information of the MPEG-4 Video Frames over Wireless Networks
The performance of video transmission over wireless channels is limited by the channel noise. Thus many error resilience tools have been incorporated into the MPEG-4 video compression method. In addition to these tools, the unequal error protection (UEP) technique has been proposed to protect the different parts in an MPEG-4 video packet with different channel coding rates based on the rate...